feat: quality improvements — multi-model routing, incremental reviews, structured logging#35

Merged
haasonsaas merged 2 commits into main from feat/quality-improvements on Mar 11, 2026

Conversation


haasonsaas (Collaborator) commented Mar 11, 2026

Summary

  • Fix panic-prone unwrap() in production paths (LLM adapter model parsing, verification regex captures)
  • 82 new tests across API (36), database (27), state (2), config (4), and incremental SHA tracking (2)
  • GitHub suggestion blocks with multi-line support (start_line/line) for one-click PR fixes
  • Incremental review (#16): push-by-push delta using last_reviewed_shas tracking and the GitHub compare API, with graceful fallback to a full diff
  • Multi-model routing (#26): ModelRole::Fast cascades model_fast → model_weak → primary model for lightweight tasks (commit messages, PR summaries, titles)
  • Wire up dead code: InteractiveCommand → webhook handler, storage methods → API endpoints, model_name() → pipeline logging, detect_context_window → doctor command
  • Structured logging: all eprintln!/println! replaced with tracing::error!/warn!/info! in server modules
  • Remove unused code: Plugin trait, legacy Ollama types, inline OllamaShowRequest/Response
  • New API endpoints: DELETE /api/review/{id}, POST /api/reviews/prune

Closes #8, closes #16, closes #26

Test plan

  • cargo clippy -- -D warnings passes clean
  • cargo test passes — 859 tests (up from 777), 0 failures
  • Verify incremental review flow: open PR → push → verify compare API diff is used
  • Verify model_fast config cascades correctly in production
  • Test @diffscope commands in PR comments trigger webhook handler

🤖 Generated with Claude Code



…ws, structured logging, tests

- Fix panic-prone unwrap() calls in production paths (adapters, verification)
- Add 82 new tests: API layer (36), database layer (27), state (2), config (4), incremental SHA (2), and more
- GitHub suggestion blocks: multi-line support with start_line/line for one-click PR fixes
- Incremental review: push-by-push delta using last_reviewed_shas and compare API
- Multi-model routing: ModelRole::Fast cascades model_fast → model_weak → primary
- Wire up dead code: InteractiveCommand → webhook, storage → API, model_name → pipeline, detect_context_window → doctor
- Remove unused Plugin trait, legacy Ollama types
- Replace all eprintln!/println! with tracing::error!/warn!/info! in server modules
- Add delete_review and prune_reviews API endpoints
- Reduce cloning by inlining OllamaShowRequest/Response

Closes #8, #16, #26

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

devin-ai-integration (bot) left a comment


Devin Review found 3 potential issues.

View 6 additional findings in Devin Review.


Comment on lines +590 to +591
let limit = (per_page * 2) as i64; // fetch extra to merge
if let Ok(stored) = state.storage.list_reviews(limit, 0).await {


🔴 list_reviews storage query fetches too few rows, breaking pagination for pages beyond page 1

The new list_reviews implementation fetches at most per_page * 2 reviews from the storage backend with a fixed offset of 0 (src/server/api.rs:590-591). For any page beyond the first, this is insufficient. For example, if there are 0 in-memory reviews and 100 reviews in storage, requesting page 3 with per_page=20 (items 41-60) will return an empty result: the storage query only returns the first 40 items (20*2), so after sorting and slicing at index 40, nothing is available. The correct approach would be to fetch at least page * per_page items from storage (or all items up to a reasonable cap) to ensure later pages are populated.

Suggested change
-let limit = (per_page * 2) as i64; // fetch extra to merge
+let limit = (page * per_page).max(per_page * 2) as i64; // fetch enough to cover requested page
 if let Ok(stored) = state.storage.list_reviews(limit, 0).await {
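The arithmetic behind the suggested fix can be sketched as a standalone helper (the function name is illustrative, not the PR's actual code):

```rust
// Illustrative sketch of the corrected storage fetch limit: to serve a
// 1-based `page`, the backend must return at least page * per_page rows;
// keeping the old 2x floor preserves headroom for merging with
// in-memory reviews.
fn storage_fetch_limit(page: u32, per_page: u32) -> i64 {
    ((page * per_page).max(per_page * 2)) as i64
}

fn main() {
    // The old code always fetched per_page * 2 = 40 rows, so page 3
    // (items 41-60) sliced past the end of the result set.
    assert_eq!(storage_fetch_limit(3, 20), 60);
    assert_eq!(storage_fetch_limit(1, 20), 40);
}
```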

Comment on lines +640 to +641
let max_age_secs = params.max_age_days.unwrap_or(30) * 86400;
let max_count = params.max_count.unwrap_or(1000);


🔴 Negative max_age_days in prune_reviews can delete all reviews

The prune_reviews endpoint accepts user-supplied max_age_days as Option<i64> and multiplies it directly by 86400 to get max_age_secs (src/server/api.rs:640). A negative value (e.g. max_age_days=-1 → max_age_secs=-86400) means every review's age exceeds the threshold, causing all reviews to be pruned. This is a data-loss risk since the endpoint has no authentication and accepts arbitrary query parameters.

Suggested change
-let max_age_secs = params.max_age_days.unwrap_or(30) * 86400;
-let max_count = params.max_count.unwrap_or(1000);
+let max_age_secs = params.max_age_days.unwrap_or(30).max(1) * 86400;
+let max_count = params.max_count.unwrap_or(1000).max(1);
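The clamping can be exercised in isolation with a small sketch (the function is hypothetical, extracted from the handler for testability):

```rust
// Illustrative sketch of the clamped prune parameters: negative or zero
// user-supplied values are raised to 1, so a hostile query string cannot
// make every review look "too old" and wipe the table.
fn prune_params(max_age_days: Option<i64>, max_count: Option<i64>) -> (i64, i64) {
    let max_age_secs = max_age_days.unwrap_or(30).max(1) * 86_400;
    let max_count = max_count.unwrap_or(1000).max(1);
    (max_age_secs, max_count)
}

fn main() {
    // ?max_age_days=-1 no longer yields a negative age threshold.
    assert_eq!(prune_params(Some(-1), None), (86_400, 1000));
    // A zero max_count is clamped instead of deleting everything.
    assert_eq!(prune_params(None, Some(0)), (30 * 86_400, 1));
}
```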

Comment on lines +636 to +653
pub async fn prune_reviews(
State(state): State<Arc<AppState>>,
Query(params): Query<PruneParams>,
) -> Json<serde_json::Value> {
let max_age_secs = params.max_age_days.unwrap_or(30) * 86400;
let max_count = params.max_count.unwrap_or(1000);

match state.storage.prune(max_age_secs, max_count).await {
Ok(pruned) => {
info!("Pruned {} old reviews", pruned);
Json(serde_json::json!({ "ok": true, "pruned": pruned }))
}
Err(e) => {
warn!("Prune failed: {}", e);
Json(serde_json::json!({ "ok": false, "error": e.to_string() }))
}
}
}


🟡 prune_reviews endpoint only prunes storage backend, leaving in-memory state inconsistent

The new prune_reviews endpoint at src/server/api.rs:636-653 calls state.storage.prune(...) but does not prune the in-memory state.reviews HashMap. After a prune, the list_reviews endpoint will still return pruned reviews from the in-memory portion (added at src/server/api.rs:584-587), and get_review will also find them in memory (checked first at src/server/api.rs:563-567). The in-memory and storage views become inconsistent until the server restarts or in-memory reviews are naturally evicted by AppState::prune_old_reviews.

Prompt for agents
In src/server/api.rs, the prune_reviews function (lines 636-653) should also prune reviews from the in-memory state.reviews HashMap to stay consistent with what was pruned from storage. After the storage.prune() call succeeds, add logic to also remove matching reviews from state.reviews (e.g., reviews older than max_age_secs and over max_count). You could reuse or extend AppState::prune_old_reviews from src/server/state.rs:379 to accept the same parameters.
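One way to keep the in-memory map consistent is to apply the same policy locally after storage.prune() succeeds. A minimal sketch, assuming a simplified id-to-timestamp map rather than DiffScope's actual types:

```rust
use std::collections::HashMap;

// Hypothetical sketch: prune the in-memory map with the same policy as
// the storage backend — drop reviews older than the age cutoff, then
// trim the oldest entries until at most max_count remain.
fn prune_in_memory(
    reviews: &mut HashMap<String, u64>, // id -> created_at (unix secs)
    now: u64,
    max_age_secs: u64,
    max_count: usize,
) {
    let cutoff = now.saturating_sub(max_age_secs);
    reviews.retain(|_, created| *created >= cutoff);
    if reviews.len() > max_count {
        let mut by_age: Vec<(String, u64)> =
            reviews.iter().map(|(k, v)| (k.clone(), *v)).collect();
        by_age.sort_by_key(|(_, created)| *created); // oldest first
        let excess = reviews.len() - max_count;
        for (id, _) in by_age.iter().take(excess) {
            reviews.remove(id);
        }
    }
}

fn main() {
    let mut m = HashMap::from([
        ("old".to_string(), 100u64),
        ("mid".to_string(), 500),
        ("new".to_string(), 900),
    ]);
    prune_in_memory(&mut m, 1000, 600, 1); // age cutoff 400, then cap at 1
    assert!(!m.contains_key("old")); // dropped by age
    assert!(!m.contains_key("mid")); // dropped by count cap
    assert!(m.contains_key("new"));
}
```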

…apter

- Fix list_reviews pagination: fetch enough rows from storage to cover
  the requested page, not just 2x per_page
- Fix prune_reviews: also prune in-memory state, clamp max_age_days
  and max_count to >= 1 to prevent negative values deleting all reviews
- Avoid redundant adapter allocation in smart_review when fast model
  matches primary model — reuse existing adapter reference
- Run cargo fmt to fix CI formatting failure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

devin-ai-integration (bot) left a comment


Devin Review found 4 potential issues.

View 6 additional findings in Devin Review.



🟡 Context window detection lost model_info fallback, causing missed detection for many Ollama models

The old detect_model_context_window in src/commands/doctor.rs had two fallback strategies for detecting context window size: (1) parsing num_ctx from the parameters string, and (2) checking model_info for context_length, llama.context_length, and general.context_length. The replacement OfflineModelManager::detect_context_window (src/core/offline.rs:284-304) only implements strategy (1) via parse_num_ctx_from_params. Many Ollama models expose their context window through model_info fields rather than the parameters string, so these models will now show no context window information in diffscope doctor.

(Refers to lines 300-304)
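The two-strategy detection the finding describes can be sketched as follows. The helper name and key list mirror the review comment; the data shapes (a plain string and a flat map) are simplifications of Ollama's actual /api/show response:

```rust
use std::collections::HashMap;

// Hypothetical sketch of the old doctor's two fallback strategies:
// (1) parse num_ctx from the `parameters` string, then
// (2) check model_info for the context-length keys Ollama commonly sets.
fn detect_context_window(
    parameters: &str,
    model_info: &HashMap<String, u64>,
) -> Option<u64> {
    // Strategy 1: a "num_ctx 8192" line in the parameters string.
    for line in parameters.lines() {
        let mut parts = line.split_whitespace();
        if parts.next() == Some("num_ctx") {
            if let Some(n) = parts.next().and_then(|v| v.parse().ok()) {
                return Some(n);
            }
        }
    }
    // Strategy 2: context-length fields in model_info.
    for key in ["context_length", "llama.context_length", "general.context_length"] {
        if let Some(&n) = model_info.get(key) {
            return Some(n);
        }
    }
    None
}

fn main() {
    let info = HashMap::from([("llama.context_length".to_string(), 131072u64)]);
    // No num_ctx in parameters, but model_info still yields a value —
    // this is the case the replacement implementation misses.
    assert_eq!(detect_context_window("temperature 0.7", &info), Some(131072));
    assert_eq!(detect_context_window("num_ctx 8192", &info), Some(8192));
}
```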


drop(config);

match crate::adapters::llm::create_adapter(&model_config) {
Ok(adapter) => match cmd.execute(adapter.as_ref(), None).await {


🔴 issue_comment webhook handler passes None diff to review command, making @diffscope review non-functional

The new issue_comment webhook handler at src/server/github.rs:504 calls cmd.execute(adapter.as_ref(), None), always passing None for diff_content. When the command is @diffscope review, execute_review (src/core/interactive.rs:79) checks if let Some(diff) = diff_content and returns "No diff content available for review." when it's None. Since the handler never fetches the PR diff from GitHub before executing, the review command is entirely non-functional via webhook comments — it will always respond with that error message.

Prompt for agents
In src/server/github.rs, inside the issue_comment handler (around line 493-511), before executing LLM-based commands like Review or Explain, fetch the PR diff from GitHub. The issue payload contains issue.pull_request.url if the issue is a PR. Use this to detect if it's a PR, fetch the diff via the GitHub API (similar to how the pull_request handler does it via fetch_full_pr_diff), and pass it to cmd.execute(). For non-PR issues or if the diff fetch fails, pass None as a fallback. Also, you'll need to resolve the auth_token properly (see BUG-0002) to be able to fetch the diff.

Comment on lines +514 to +527
if let Some(ref auth) = token {
let comment_url = format!(
"https://api.github.com/repos/{}/issues/{}/comments",
repo, issue_number
);
let _ = state
.http_client
.post(&comment_url)
.header("Authorization", format!("Bearer {}", auth))
.header("User-Agent", "DiffScope")
.json(&serde_json::json!({ "body": response_body }))
.send()
.await;
}


🔴 issue_comment webhook handler ignores GitHub App auth, so responses are never posted for App deployments

In the pull_request handler, when a GitHub App is configured (with github_app_id and github_private_key), the code obtains an installation token via get_installation_token(). However, the new issue_comment handler at src/server/github.rs:514 only checks if let Some(ref auth) = token, where token is config.github_token (a PAT). For GitHub App deployments — which CLAUDE.md mandates as the recommended auth method — github_token is typically None, so the response comment will silently never be posted. The handler should obtain an installation token from the webhook payload's installation.id, just like the pull_request handler does.

Prompt for agents
In src/server/github.rs, inside the issue_comment handler (around line 467-536), add GitHub App installation token resolution similar to the pull_request handler. Extract the installation_id from payload["installation"]["id"], and if github_app_id and private_key are available, call get_installation_token() to get an auth token. Use that token (falling back to the PAT token) when posting the response comment at line 514. The auth resolution logic can be extracted into a shared helper to avoid duplication with the pull_request handler.
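The precedence the finding asks for reduces to a one-line rule once the (async) installation-token minting is done. A minimal sketch of just that precedence, with illustrative names:

```rust
// Hypothetical sketch of the auth-resolution order the review suggests:
// prefer a GitHub App installation token (minted from the payload's
// installation.id) and fall back to the PAT from config. Token minting
// itself is elided; this only shows the precedence.
fn resolve_auth_token(
    installation_token: Option<String>,
    pat: Option<String>,
) -> Option<String> {
    installation_token.or(pat)
}

fn main() {
    // App deployment: the installation token wins even if a PAT exists.
    assert_eq!(
        resolve_auth_token(Some("ghs_app".into()), Some("ghp_pat".into())),
        Some("ghs_app".to_string())
    );
    // PAT-only deployment can still post comments.
    assert_eq!(
        resolve_auth_token(None, Some("ghp_pat".into())),
        Some("ghp_pat".to_string())
    );
}
```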

Comment on lines +926 to +930
ModelRole::Fast => self
.model_fast
.as_deref()
.or(self.model_weak.as_deref())
.unwrap_or(&self.model),


🟡 Fast model role defaults to smaller models, violating CLAUDE.md convention

CLAUDE.md states: "Use frontier models for reviews — never default to smaller models." The new ModelRole::Fast fallback chain at src/config.rs:926-930 is model_fast -> model_weak -> primary. When model_fast is not set but model_weak is configured (e.g., claude-haiku), the Fast role silently defaults to the smaller/cheaper Weak model for PR summaries, commit messages, PR titles, and diagram generation. These are all part of the review pipeline. While the core code analysis still uses the frontier model, this fallback chain means the review pipeline as a whole can default to smaller models without the user explicitly opting in via model_fast.

Suggested change
-ModelRole::Fast => self
-    .model_fast
-    .as_deref()
-    .or(self.model_weak.as_deref())
-    .unwrap_or(&self.model),
+ModelRole::Fast => self
+    .model_fast
+    .as_deref()
+    .unwrap_or(&self.model),
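The behavioral difference between the two chains can be isolated in a small sketch (function names are illustrative; the real code matches on ModelRole inside the config):

```rust
// Illustrative comparison of the two fallback chains for ModelRole::Fast.
// `merged` is the chain the PR shipped; `suggested` is the reviewer's
// stricter chain that never falls through to model_weak.
fn fast_model_merged<'a>(
    model_fast: Option<&'a str>,
    model_weak: Option<&'a str>,
    primary: &'a str,
) -> &'a str {
    model_fast.or(model_weak).unwrap_or(primary)
}

fn fast_model_suggested<'a>(model_fast: Option<&'a str>, primary: &'a str) -> &'a str {
    model_fast.unwrap_or(primary)
}

fn main() {
    // With only model_weak configured, the merged chain silently picks
    // the smaller model; the suggested chain stays on the primary.
    assert_eq!(
        fast_model_merged(None, Some("claude-haiku"), "claude-opus"),
        "claude-haiku"
    );
    assert_eq!(fast_model_suggested(None, "claude-opus"), "claude-opus");
}
```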

haasonsaas merged commit c8af0ed into main on Mar 11, 2026
6 checks passed
haasonsaas deleted the feat/quality-improvements branch on March 11, 2026 at 19:52